Large-Scale Sparse Kernel Logistic Regression — with a comparative study on optimization algorithms
Abstract
Kernel Logistic Regression (KLR) is a powerful probabilistic classification tool, but both its training and testing suffer from severe computational bottlenecks on large-scale data. Traditionally, an ℓ1 penalty is used to induce sparsity in the parameter space for fast testing. However, most existing optimization methods for training ℓ1-penalized KLR do not scale well to large data. In this work, we present highly scalable training of the KLR model via three first-order optimization methods: the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), Coordinate Gradient Descent (CGD), and a variant of Stochastic Gradient Descent (SGD). To further reduce the space and time complexity, we apply a simple kernel linearization technique which achieves similar results at a fraction of the computational cost. While SGD is the fastest at training on large-scale data, we show that CGD performs considerably better in some cases on various quality measures. Based on this observation, we propose a multi-scale extension of FISTA which improves its computational performance significantly in practice while preserving the theoretical global convergence rate. We further propose a two-stage active-set training scheme for CGD and FISTA, which boosts prediction accuracy by up to 4%. Extensive experiments on several data sets containing up to millions of samples demonstrate the effectiveness of our approach.
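To make the pieces concrete, here is a minimal sketch of FISTA applied to an ℓ1-regularized logistic loss over an explicit feature matrix, which is the shape the problem takes after a kernel linearization step (e.g. a Nyström or random-feature map). It illustrates only the generic algorithm, under our own naming; the paper's multi-scale and active-set extensions are not reproduced here.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fista_l1_logreg(X, y, lam, n_iter=200):
    """Sketch of FISTA for
        min_w (1/n) * sum_i log(1 + exp(-y_i * x_i^T w)) + lam * ||w||_1,
    with X an (n, d) feature matrix (e.g. a linearized kernel map)
    and labels y in {-1, +1}. Illustrative, not the paper's code.
    """
    n, d = X.shape
    # Lipschitz constant of the logistic-loss gradient: ||X||_2^2 / (4n).
    L = np.linalg.norm(X, 2) ** 2 / (4.0 * n)
    w = np.zeros(d)   # current iterate
    z = w.copy()      # extrapolated point
    t = 1.0           # momentum parameter
    for _ in range(n_iter):
        margin = y * (X @ z)
        grad = -(X.T @ (y / (1.0 + np.exp(margin)))) / n
        w_next = soft_threshold(z - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w
```

Roughly speaking, exact KLR would use the n x n Gram matrix in place of X and interpret w as dual coefficients; replacing that matrix with a much thinner approximate feature map is where the space and time savings of linearization come from.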
Similar Papers
Efficient Online Learning for Large-Scale Sparse Kernel Logistic Regression
In this paper, we study the problem of large-scale Kernel Logistic Regression (KLR). A straightforward approach is to apply stochastic approximation to KLR. We refer to this as a non-conservative online learning algorithm because it updates the kernel classifier after every received training example, leading to a dense classifier. To improve the sparsity of the KLR classifier, we propose...
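As a point of reference, here is a minimal sketch of the non-conservative baseline this snippet describes: functional gradient steps on the logistic loss with an RBF kernel, where every incoming example receives its own coefficient, so the classifier grows densely with the stream. The kernel choice, step size, and the omission of regularization-driven shrinkage are our simplifications; the paper's proposed sparse variant is not shown.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def online_klr_nonconservative(stream, eta=0.1, gamma=1.0):
    """Non-conservative online KLR sketch: f(x) = sum_j alpha_j * k(x_j, x).
    `stream` yields (x, y) pairs with y in {-1, +1}; each pair appends a
    new support vector, which is exactly why the model ends up dense.
    """
    support, alphas = [], []
    for x, y in stream:
        f = sum(a * rbf(xj, x, gamma) for a, xj in zip(alphas, support))
        # Functional gradient step on log(1 + exp(-y * f)): the new
        # coefficient is -eta * d(loss)/df = eta * y / (1 + e^{y f}).
        alphas.append(eta * y / (1.0 + np.exp(y * f)))
        support.append(x)
    return support, alphas
```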
Kernel Logistic Regression Algorithm for Large-Scale Data Classification
Kernel Logistic Regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is rarely used in large-scale data classification, mainly because it is computationally expensive. In this paper, we present a new KLR algorithm based on Truncated Regularized Iteratively Reweighted Least Squares (TR-IRLS)...
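The TR-IRLS idea, as we read it from the acronym, is to run IRLS (Newton) steps for ℓ2-regularized logistic regression but solve each weighted least-squares system only approximately, with a truncated conjugate-gradient run. A rough sketch under those assumptions, with our own variable names and stopping rule:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def tr_irls(X, y, lam=1.0, newton_steps=10, cg_iters=30):
    """Sketch of TR-IRLS for l2-regularized logistic regression,
    y in {-1, +1}: each Newton/IRLS step solves
        (X^T D X / n + lam * I) dw = -grad
    approximately via truncated conjugate gradient.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(newton_steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))              # P(y = +1 | x)
        grad = X.T @ (p - (y + 1) / 2.0) / n + lam * w
        D = p * (1.0 - p) / n                           # IRLS weights
        H = LinearOperator((d, d), dtype=float,
                           matvec=lambda v: X.T @ (D * (X @ v)) + lam * v)
        dw, _ = cg(H, -grad, maxiter=cg_iters)          # truncated CG run
        w = w + dw
    return w
```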
A Fast Hybrid Algorithm for Large-Scale ℓ1-Regularized Logistic Regression
ℓ1-regularized logistic regression, also known as sparse logistic regression, is widely used in machine learning, computer vision, data mining, bioinformatics and neural signal processing. The ℓ1 regularization confers attractive properties on the classifier, such as feature selection, robustness to noise and, as a result, classifier generality in the context of supervised learning. ...
Fast Implementation of ℓ1-Regularized Learning Algorithms Using Gradient Descent Methods
With the advent of high-throughput technologies, ℓ1-regularized learning algorithms have attracted much attention recently. Dozens of algorithms have been proposed for fast implementation, using various advanced optimization techniques. In this paper, we demonstrate that ℓ1-regularized learning problems can be easily solved using gradient-descent techniques. The basic idea is to transform a ...
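The snippet cuts off before naming the transformation, so the following is only a standard example of the kind of reformulation it gestures at: splitting the weights into positive and negative parts turns the non-smooth ℓ1 term into a linear function over the positive orthant,

```latex
\min_{u \ge 0,\; v \ge 0} \;
  \frac{1}{n} \sum_{i=1}^{n} \log\!\left(1 + e^{-y_i (u - v)^{\top} x_i}\right)
  + \lambda\, \mathbf{1}^{\top} (u + v),
  \qquad w = u - v .
```

At any optimum u and v have disjoint supports, so \mathbf{1}^{\top}(u+v) = \|w\|_1, and the resulting smooth bound-constrained problem can be handled by plain projected gradient descent.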
Publication year: 2011